Discrete Fourier analysis for phylogenetic trees
Abstract
Discrete Fourier transformations (DFTs) provide a useful tool for assigning a phylogenetic tree (PGT) to the observed frequencies of nucleotide patterns in the DNA sequences of species. The advantage of this kind of spectral analysis is that it allows global correction for multi-substitution processes [1].

SPECTRAL ANALYSIS OF PGTS

Two spectra characterize a PGT: the probability spectrum p(T) and the expected sequence spectrum s(T). After labelling the edges of the tree in an appropriate way, the two can be related by two transformation steps using vector functions called Hadamard conjugations. The intermediate vector, q(T), is called the edge length spectrum. The transformation scheme is given in Fig. 1.

This scheme can be used in two ways. Starting with a probability distribution, we can calculate the edge length spectrum and the expected sequence spectrum. Conversely, given a data set D, we can take the observed sequence spectrum s(D) (the relative frequencies of character patterns) as an estimate for s(T). From this we calculate a conjugate spectrum γ(D) (the 'corrected partition frequencies') [1, 4], which corrects for all parallel, multiple, and higher-order substitutions. We then find the corresponding tree, that is, the tree for which | γ(D) – q(T) | is minimal, using a fitting algorithm (e.g. least-squares best fit or the 'closest tree algorithm'). Having found this tree, one can reconstruct the probability spectrum and the expected sequence spectrum.

HADAMARD CONJUGATION

A conjugation consists of three transformations applied in succession; the third is the inverse of the first. The m × m Hadamard matrix H_t is defined as in Hendy et al., Proc. Natl. Acad. Sci. USA 91 (1994).

Fig. 1. Scheme of transformations.
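The three-step structure of a conjugation (apply H, apply a component-wise function, apply the inverse of H) can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: a two-state symmetric substitution model, a Hadamard matrix built by the Sylvester doubling construction, the forward map s = H^-1 exp(H q) and the inverse map γ = H^-1 ln(H s), with the first entry of q set to minus the sum of the others. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math

def hadamard(t):
    """Sylvester construction (assumed here): H_0 = [1],
    H_{k+1} = [[H_k, H_k], [H_k, -H_k]], giving a 2^t x 2^t matrix."""
    h = [[1]]
    for _ in range(t):
        top = [row + row for row in h]
        bottom = [row + [-x for x in row] for row in h]
        h = top + bottom
    return h

def apply(h, v):
    """Matrix-vector product H v."""
    return [sum(a * b for a, b in zip(row, v)) for row in h]

def apply_inverse(h, v):
    """For a Sylvester Hadamard matrix of order m, H^-1 = H / m."""
    m = len(h)
    return [x / m for x in apply(h, v)]

def conjugate(v, f):
    """Generic conjugation: H, then component-wise f, then H^-1."""
    t = len(v).bit_length() - 1  # len(v) is assumed to be a power of two
    h = hadamard(t)
    return apply_inverse(h, [f(x) for x in apply(h, v)])

def edge_to_sequence(q):
    """Assumed forward step: s = H^-1 exp(H q)."""
    return conjugate(q, math.exp)

def sequence_to_edge(s):
    """Assumed inverse step: gamma = H^-1 ln(H s)."""
    return conjugate(s, math.log)

# Hypothetical edge length spectrum for m = 4; the first entry is
# minus the sum of the remaining entries.
q = [-0.35, 0.1, 0.2, 0.05]
s = edge_to_sequence(q)      # expected sequence spectrum under the assumptions
gamma = sequence_to_edge(s)  # recovers q up to rounding error
```

With the first entry of q chosen this way, the entries of s sum to one, so s can be read as a distribution over character patterns. Applying `sequence_to_edge` to an observed spectrum s(D) instead of s(T) would give the corrected spectrum γ(D) that the fitting step compares against q(T).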
Publication date: 2001